u-BeepBeep: Low Energy Acoustic Ranging on Mobile Devices
We present u-BeepBeep, a low-energy acoustic ranging service for mobile phones. u-BeepBeep combines the efficacy of the basic BeepBeep ranging mechanism with a lightweight cross-correlation mechanism based on sparse approximation
Acoustical Ranging Techniques in Embedded Wireless Sensor Networked Devices
Location sensing provides endless opportunities for a wide range of applications in GPS-obstructed environments, where there is typically a need for a higher degree of accuracy. In this article, we focus on robust range estimation, an important prerequisite for fine-grained localization. Motivated by the promise of acoustics in delivering high ranging accuracy, we present the design, implementation, and evaluation of acoustic (both ultrasound and audible) ranging systems. We distill the limitations of acoustic ranging and present efficient signal designs and detection algorithms to overcome the challenges of coverage, range, accuracy/resolution, tolerance to the Doppler effect, and audible intensity. We evaluate our proposed techniques experimentally on TWEET, a low-power platform purpose-built for acoustic ranging applications. Our experiments demonstrate an operational range of 20 m (outdoors) and an average accuracy of 2 cm in the ultrasound domain. Finally, we present the design of an audible-range acoustic tracking service that combines a near-inaudible acoustic broadband chirp with an approximately twofold increase in Doppler tolerance to achieve better performance
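The core idea the abstract describes, ranging by detecting a known broadband chirp via cross-correlation and converting the time-of-flight into a distance, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the chirp band, sample rate, and synchronization model are assumptions chosen for the example.

```python
import numpy as np

# Assumed parameters; the paper's actual signal design differs.
FS = 48_000   # sample rate (Hz)
C = 343.0     # speed of sound (m/s)

def make_chirp(f0=18_000.0, f1=20_000.0, dur=0.01, fs=FS):
    """Linear broadband chirp in a near-inaudible band (assumption)."""
    t = np.arange(int(dur * fs)) / fs
    k = (f1 - f0) / dur                      # sweep rate (Hz/s)
    return np.sin(2 * np.pi * (f0 * t + 0.5 * k * t ** 2))

def estimate_range(recording, template, fs=FS, c=C):
    """Estimate distance from the time-of-flight located by
    cross-correlating the recording against the known chirp.
    Assumes sender and receiver clocks are already synchronized."""
    corr = np.correlate(recording, template, mode="valid")
    delay_samples = int(np.argmax(np.abs(corr)))  # correlation peak = arrival
    tof = delay_samples / fs
    return tof * c

# Simulate a chirp arriving after 2 m of flight, plus noise.
chirp = make_chirp()
delay = int(round(2.0 / C * FS))
rx = np.zeros(delay + len(chirp) + 1000)
rx[delay:delay + len(chirp)] += chirp
rx += 0.05 * np.random.default_rng(0).standard_normal(len(rx))

print(round(estimate_range(rx, chirp), 2))
```

A chirp's sharp autocorrelation peak is what makes this detection robust to noise; the sparse-approximation variant mentioned for u-BeepBeep replaces the full cross-correlation with a cheaper approximation of the same peak search.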
RECURRENCE RISK OF INFERIOR SURFACE LEUKOPLAKIA OF THE VOCAL CORDS: A RETROSPECTIVE STUDY
Background:
Vocal fold leukoplakia (VFL) remains a diagnostic and therapeutic challenge despite advances in our knowledge of its etiopathogenetic factors and in laryngeal visualisation. This study sought to identify lesions on the inferior surface of the vocal folds as a recurrence risk factor.
Methods:
This was a retrospective study with two years of data collection. The study included 37 VFL patients, who were separated into nonrecurrent and recurrent categories. Each patient's clinicopathological characteristics and surgical procedures were scrutinised.
Results:
Of the 37 patients, 15 (40.5%) exhibited residual (n = 3) or recurrent (n = 12) VFL. Inferior surface lesions of the vocal fold were present at the time of the initial operation in 8 of 12 (66.7%) patients with recurrence versus 6 of 22 (27.3%) patients without recurrence (P = .036). Recurrence was thus significantly more frequent in patients with inferior surface lesions; none of the other evaluated factors was associated with recurrence.
Conclusion:
The presence of VFL lesions on the inferior surface is a significant recurrence risk factor.
Generalizable Zero-Shot Speaker Adaptive Speech Synthesis with Disentangled Representations
While most research into speech synthesis has focused on synthesizing
high-quality speech for in-dataset speakers, an equally essential yet unsolved
problem is synthesizing speech for unseen speakers who are out-of-dataset with
limited reference data, i.e., speaker adaptive speech synthesis. Many studies
have proposed zero-shot speaker adaptive text-to-speech and voice conversion
approaches aimed at this task. However, most current approaches suffer from the
degradation of naturalness and speaker similarity when synthesizing speech for
unseen speakers (i.e., speakers not in the training dataset) due to the poor
generalizability of the model in out-of-distribution data. To address this
problem, we propose GZS-TV, a generalizable zero-shot speaker adaptive
text-to-speech and voice conversion model. GZS-TV introduces disentangled
representation learning for both speaker embedding extraction and timbre
transformation to improve model generalization and leverages the representation
learning capability of the variational autoencoder to enhance the speaker
encoder. Our experiments demonstrate that GZS-TV reduces performance
degradation on unseen speakers and outperforms all baseline models in multiple
datasets. Comment: 5 pages, 3 figures. Accepted by Interspeech 2023, Ora
AutoLV: Automatic Lecture Video Generator
We propose an end-to-end lecture video generation system that can generate
realistic and complete lecture videos directly from annotated slides,
instructor's reference voice and instructor's reference portrait video. Our
system is primarily composed of a speech synthesis module with few-shot speaker
adaptation and an adversarial learning-based talking-head generation module. It
is capable of not only reducing instructors' workload but also changing the
language and accent which can help the students follow the lecture more easily
and enable a wider dissemination of lecture contents. Our experimental results
show that the proposed model outperforms other current approaches in terms of
authenticity, naturalness and accuracy. A video demonstration of how our system works, together with the outcomes of the evaluation and comparison, is available at https://youtu.be/cY6TYkI0cog. Comment: 4 pages, 4 figures, ICIP 202
- …